Profiling

WOLF performs profiling of its main processes. This is important for assessing the developer to evaluate the efficency of the algorithms. Also it may help the end-user to properly configure the problem.

ProfilingUnit

The class ProfilingUnit is the responsible of computing the CPU time of the different WOLF processes.

class ProfilingUnit
{
  protected:
    bool                      running_;            ///< True if profiling is running.
    double                    desired_period_us_;  ///< Desired period in microseconds (< 0 for not specified)
    unsigned int              n_callbacks_;        ///< Number of callbacks.
    double                    average_period_us_;  ///< Average period in microseconds.
    std::chrono::microseconds acc_duration_us_;    ///< Accumulated duration in microseconds.
    std::chrono::microseconds max_duration_us_;    ///< Maximum duration in microseconds.
    double                    var_duration_us_;    ///< Variance of the duration in microseconds squared.
    std::chrono::time_point<std::chrono::high_resolution_clock> first_clock_;  ///< Time point of the first clock.
    std::chrono::time_point<std::chrono::high_resolution_clock> start_clock_;  ///< Time point of the start clock.

  public:
    /**
     * @brief Constructor
     * @param desired_period_sec Desired period in seconds. Negative value means not specified. Default is -1.
     */
    ProfilingUnit(double desired_period_sec = -1);

    /**
     * @brief Start profiling.
     */
    void startProfiling();

    /**
     * @brief Stop profiling.
     */
    void stopProfiling();

    /**
     * @brief Print profiling results.
     * @param _tabs Tabs to be used for indentation.
     * @param stream Output stream to print the results. Default is std::cout.
     */
    void printProfiling(const std::string& _tabs, std::ostream& stream = std::cout) const;

    /**
     * @brief Print provided duration (in us) in the most appropiate units with specified precision.
     * @param _duration_microsec Duration in microseconds.
     * @param precision Precision for printing the duration. Default is 2.
     * @return A string representing the duration.
     */
    std::string printDuration(double _duration_microsec, int precision = 2) const;

    /**
     * @brief Set the desired period.
     * @param _desired_period_sec Desired period in seconds.
     */
    void setDesiredPeriod(double _desired_period_sec);

    /**
     * @brief Check if profiling is running.
     * @return True if profiling is running, false otherwise.
     */
    bool isRunning() const;

    /**
     * @brief Get the number of callbacks.
     * @return Number of callbacks.
     */
    unsigned int getNCallbacks() const;

    /**
     * @brief Get the desired period.
     * @return Desired period in seconds.
     */
    double getDesiredPeriod() const;

    /**
     * @brief Get the average period.
     * @return Average period in seconds.
     */
    double getAveragePeriod() const;

    /**
     * @brief Get the accumulated duration.
     * @return Accumulated duration in seconds.
     */
    double getAccDuration() const;

    /**
     * @brief Get the average duration.
     * @return Average duration in seconds.
     */
    double getAvgDuration() const;

    /**
     * @brief Get the maximum duration.
     * @return Maximum duration in seconds.
     */
    double getMaxDuration() const;

    /**
     * @brief Get the variance of the duration.
     * @return Variance of the duration in seconds squared.
     */
    double getVarDuration() const;

    /**
     * @brief Get the standard deviation of the duration.
     * @return Standard deviation of the duration in seconds.
     */
    double getStDevDuration() const;
};

If the process that is being profiled has a desired period, it can be specified to the profiler unit via constructor or setter.

A brief profiling report can be obtained using ProfilingUnit::printProfiling(const std::string& _tabs, std::ostream& stream = std::cout) which will print the following information:

  • Amount of instances.

  • Total time.

  • Desired period (if specified).

  • Average period.

  • Average time.

  • St. Dev. of time.

  • Max. time

The following WOLF nodes have profiling units:

  • ProcessorBase: 2 units

    • Capture processing

    • Frame processing

  • SolverManager: 5 units

    • Total solving time

      • Update problem

      • Solve

      • Update state blocks

    • Covariance computing time

  • Publisher: 1 unit

All this classes have the virtual method printProfiling(const std::string& _tabs, std::ostream& stream = std::cout). More profiling units can be added to derived classes (requires overriding this method) to help to analyze specific algorithm processes. (This is the case, for example, of ProcessorTracker).

The ROS2 node, at the end of execution, will create a profiling report including information of profiling units listed above.

void Node::createProfilingFile()
{
    if (not profiling_) return;

    std::stringstream profiling_str;
    profiling_str << "========== WOLF PROFILING ==========\n";
    auto duration = std::chrono::duration_cast<std::chrono::microseconds>(std::chrono::high_resolution_clock::now() -
                                                                          start_experiment_);
    profiling_str << "Experiment total time: " << 1e-6 * duration.count() << " s" << std::endl;

    // solver
    profiling_str << "\nSOLVER -----------------------------\n";
    solver_->printProfiling(profiling_str);

    // processors
    profiling_str << "\nPROCESSORS -------------------------\n";
    for (auto sensor : problem_ptr_->getHardware()->getSensorList())
        for (auto proc : sensor->getProcessorList()) proc->printProfiling(profiling_str);

    // publishers
    profiling_str << "\nPUBLISHERS -------------------------\n";
    for (auto pub : publishers_) pub->printProfiling(profiling_str);

    profiling_str << "\n";

    // print
    std::cout << profiling_str.str();

    // file
    if (profiling_file_.is_open())
    {
        profiling_file_ << profiling_str.str();
        profiling_file_.close();
    }
}

As it can be observed, it will also create a .txt file with the same content. Here is an example of the printed message:

========== WOLF PROFILING ==========
Experiment total time: 59.0594 s

SOLVER -----------------------------

Solver Manager:

      Solving total (update problem + solve + update state):
              instances:      59
              total time:     234.63 ms
              desired period: 1.00 s (reference)
              avg period:     1.00 s
              avg time:       3.98 ms (0%)
              stdev time:     3.69 ms (0%)
              max time:       19.96 ms (2%)
      Update problem:
                      instances:      59
                      total time:     26.31 ms
                      desired period: 1.00 s (reference)
                      avg period:     1.00 s
                      avg time:       445.86 us (0%)
                      stdev time:     224.15 us (0%)
                      max time:       1.06 ms (0%)
      Solve:
                      instances:      59
                      total time:     207.22 ms
                      desired period: 1.00 s (reference)
                      avg period:     1.00 s
                      avg time:       3.51 ms (0%)
                      stdev time:     3.53 ms (0%)
                      max time:       19.03 ms (2%)
      Update state:
                      instances:      59
                      total time:     638.00 us
                      desired period: 1.00 s (reference)
                      avg period:     1.00 s
                      avg time:       10.81 us (0%)
                      stdev time:     5.05 us (0%)
                      max time:       29.00 us (0%)
      NOT computing covariance
Iterations:
      User-defined limit:         10
      Average iterations:         1
      Max. iterations:            6
Termination:
      Convergence:                   100 %
      Interrupted by problem change: 0 %
      Max. iterations reached:       0 %

PROCESSORS -------------------------

ProcessorOdom2d - processorodom:
      total time:             1.01 s
      Processing captures:
              instances:      2916
              total time:     1.01 s
              avg period:     19.95 ms (reference)
              avg time:       346.41 us (2%)
              stdev time:     1.33 ms (7%)
              max time:       25.12 ms (126%)
      Processing keyframes:
              instances:      13
              total time:     45.00 us
              avg period:     4.55 s (reference)
              avg time:       3.46 us (0%)
              stdev time:     1.95 us (0%)
              max time:       10.00 us (0%)

ProcessorOdomIcp - processorodomicp:
      total time:             23.42 s
      Processing captures:
              instances:      1762
              total time:     23.42 s
              avg period:     33.17 ms (reference)
              avg time:       13.29 ms (40%)
              stdev time:     2.11 ms (6%)
              max time:       36.99 ms (112%)
      Processing keyframes:
              instances:      2
              total time:     9.00 us
              avg period:     44.43 ms (reference)
              avg time:       4.50 us (0%)
              stdev time:     3.50 us (0%)
              max time:       8.00 us (0%)

      preProcess:
              instances:      1750
              total time:     94.20 ms
              avg period:     33.40 ms (reference)
              avg time:       53.83 us (0%)
              stdev time:     18.40 us (0%)
              max time:       197.00 us (1%)
      processKnown:
              instances:      1750
              total time:     22.48 s
              avg period:     33.40 ms (reference)
              avg time:       12.85 ms (38%)
              stdev time:     1.18 ms (4%)
              max time:       22.10 ms (66%)
      processNew:
              instances:      14
              total time:     197.44 ms
              avg period:     4.19 s (reference)
              avg time:       14.10 ms (0%)
              stdev time:     1.00 ms (0%)
              max time:       17.29 ms (0%)
      establishFactors:
              instances:      14
              total time:     11.52 ms
              avg period:     4.19 s (reference)
              avg time:       822.50 us (0%)
              stdev time:     743.39 us (0%)
              max time:       3.41 ms (0%)
      postProcess:
              instances:      1750
              total time:     1.00 us
              avg period:     33.41 ms (reference)
              avg time:       0.00 us (0%)
              stdev time:     0.02 us (0%)
              max time:       1.00 us (0%)

ProcessorLoopClosureIcp - processorloopclosureicp:
      total time:             1.37 s
      Processing captures:
              instances:      1750
              total time:     79.96 ms
              avg period:     33.41 ms (reference)
              avg time:       45.69 us (0%)
              stdev time:     11.86 us (0%)
              max time:       158.00 us (0%)
      Processing keyframes:
              instances:      14
              total time:     1.29 s
              avg period:     4.20 s (reference)
              avg time:       92.13 ms (2%)
              stdev time:     89.94 ms (2%)
              max time:       322.83 ms (8%)

PUBLISHERS -------------------------

PublisherTf ( ):
      instances:      586
      total time:     166.31 ms
      desired period: 100.00 ms (reference)
      avg period:     100.00 ms
      avg time:       283.81 us (0%)
      stdev time:     78.57 us (0%)
      max time:       742.00 us (1%)
PublisherGraph (graph):
      instances:      59
      total time:     66.45 ms
      desired period: 1.00 s (reference)
      avg period:     1.00 s
      avg time:       1.13 ms (0%)
      stdev time:     556.74 us (0%)
      max time:       2.63 ms (0%)
PublisherLaserMap (map):
      instances:      59
      total time:     1.07 s
      desired period: 1.00 s (reference)
      avg period:     1.00 s
      avg time:       18.09 ms (2%)
      stdev time:     24.76 ms (2%)
      max time:       146.62 ms (15%)